Personal Name Resolution in Email: A Heuristic Approach

Authors

  • Tamer Elsayed
  • Galileo Namata
  • Lise Getoor
  • Douglas W. Oard
Abstract

Much of the work to date on searching email has focused on personal information management. Archival access poses new challenges, including automatic association of references to unfamiliar individuals using whatever information is available about those people. This paper describes a computational approach to that task, motivated by intuitions about the ways people might explore an email collection to find that information. The proposed approach makes use of context in a flexible and adaptive manner. Two techniques for context expansion are introduced: a mixture model that combines evidence from each context to rank candidates, and a cutoff model that ranks candidates based on the closest context in which any suitable evidence was found. Both models rely on mentions that could be resolved to a common identity as evidence of the resolution. Results on three relatively small collections indicate that the accuracy of our approach compares favorably with that of the best known technique, and results on the full CMU Enron collection indicate that the approach scales well to larger email collections.
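
The contrast between the two context-expansion techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the evidence scores, the per-context normalization, the mixture weights, and the function names mixture_rank and cutoff_rank are all assumptions made for illustration. Contexts are assumed to be ordered from closest to farthest from the mention being resolved.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

# Hypothetical representation: one dict per context, mapping a candidate
# identity to a non-negative evidence score found in that context.
ContextScores = Dict[str, float]


def mixture_rank(contexts: Sequence[ContextScores],
                 weights: Sequence[float]) -> List[str]:
    """Rank candidates by a weighted sum of evidence from every context.

    Assumption: scores are normalized within each context before mixing.
    """
    if len(contexts) != len(weights):
        raise ValueError("need exactly one weight per context")
    combined: Dict[str, float] = defaultdict(float)
    for scores, weight in zip(contexts, weights):
        total = sum(scores.values())
        if total == 0:
            continue  # this context contributes no evidence
        for candidate, score in scores.items():
            combined[candidate] += weight * score / total
    # Highest combined score first; ties broken alphabetically.
    return sorted(combined, key=lambda c: (-combined[c], c))


def cutoff_rank(contexts: Sequence[ContextScores]) -> List[str]:
    """Rank candidates using only the closest context that has evidence."""
    for scores in contexts:  # ordered closest first
        found = {c: s for c, s in scores.items() if s > 0}
        if found:
            return sorted(found, key=lambda c: (-found[c], c))
    return []  # no context offered any evidence


if __name__ == "__main__":
    # Made-up evidence for two hypothetical candidates.
    contexts = [
        {},                                        # closest: no evidence
        {"alice.smith": 2.0, "alice.jones": 3.0},  # next closest
        {"alice.smith": 9.0, "alice.jones": 1.0},  # farthest
    ]
    weights = [0.5, 0.3, 0.2]  # assumed weights, closest context first
    print(mixture_rank(contexts, weights))
    print(cutoff_rank(contexts))
```

With this made-up evidence the two models disagree: the cutoff sketch stops at the second context and prefers alice.jones, while the mixture sketch also counts the farthest context and prefers alice.smith.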


Similar resources

Modeling Identity in Archival Collections of Email: A Preliminary Study

Access to historically significant email archives poses challenges that arise less often in personal collections. Most notably, searchers may need help making sense of the identities, roles, and relationships of individuals that participated in archived email exchanges. This paper describes an exploratory study of identity resolution in the public subset of the Enron collection. Addressname and...


Coreference resolution with deep learning in the Persian Language

Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...


Automated Email Integration with Personal Information Management Applications

An email analysis system that extracts calendar information automatically from text is presented. Appointment and meeting information is extracted using a summariser and named entity recogniser and presented to a PIM system as a structured record. Examples and evaluation results are presented. Email is one of the most ubiquitous applications used on a daily basis by millions of people worldwide...


Life Beyond the Mailbox: A Cross-Tool Perspective on Personal Information Management

Email interfaces provide poor support for the personal information management (PIM) activities that users have adopted them for. This paper reports a user study that highlights how two aspects of PIM, information management and task management, cut across a range of tools, including email. We argue that effective support for such cross-tool activities cannot be provided through a focus on one i...


Heuristic-based Korean Coreference Resolution for Information Extraction

The information extraction is to delimit in advance, as part of the specification of the task, the semantic range of the output and to filter information from large volumes of texts. The most representative word of the document is composed of named entities and pronouns. Therefore, it is important to resolve coreference in order to extract the meaningful information in information extraction. C...



Publication date: 2008